N=1:

rank | frequency | n-gram |
---|---|---|
1 | 590277 | -k |
2 | 549752 | -t |
3 | 478136 | -l |
4 | 307393 | -n |
5 | 295076 | -a |

N=2:

rank | frequency | n-gram |
---|---|---|
1 | 157485 | -ak |
2 | 148697 | -ek |
3 | 117363 | -ól |
4 | 107766 | -al |
5 | 99650 | -en |

N=3:

rank | frequency | n-gram |
---|---|---|
1 | 122908 | -nak |
2 | 87315 | -nek |
3 | 79402 | -ban |
4 | 53518 | -ben |
5 | 48474 | -kat |

N=4:

rank | frequency | n-gram |
---|---|---|
1 | 30744 | -ként |
2 | 29872 | -ának |
3 | 23088 | -kkal |
4 | 21281 | -ával |
5 | 19505 | -ában |

N=5:

rank | frequency | n-gram |
---|---|---|
1 | 10180 | -,hogy |
2 | 9725 | -ással |
3 | 9703 | -okkal |
4 | 8869 | -ekkel |
5 | 7173 | -éssel |

The tables show the most frequent letter N-grams at the end of words for N=1…5. The procedure parallels that of 2.2.5 Most frequent word beginnings, except that the aim here is suffix detection rather than prefix detection.
For N=3:
SELECT @pos:=(@pos+1), xx.* FROM (SELECT @pos:=0) r, (SELECT COUNT(*) AS cnt, CONCAT("-", RIGHT(word,3)) AS ngram FROM words WHERE w_id>100 GROUP BY RIGHT(word,3) ORDER BY cnt DESC) xx LIMIT 5;
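
The same counting can be sketched outside the database for any N. The snippet below is a minimal illustration, not the procedure used to build the tables: the function name `top_endings` and the toy word list are hypothetical, and the list is assumed to hold one entry per distinct word form, as a row in `words` would.

```python
from collections import Counter

def top_endings(words, n, limit=5):
    """Return (rank, frequency, "-ending") rows for the most frequent word-final n-grams."""
    # word[-n:] mirrors right(word, n): a word shorter than n is counted whole.
    counts = Counter(word[-n:] for word in words)
    # Sort by descending frequency, then alphabetically so ties are deterministic.
    ordered = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [(rank, freq, "-" + ending)
            for rank, (ending, freq) in enumerate(ordered[:limit], start=1)]

# Hypothetical toy word list, not the corpus data.
sample_words = ["háznak", "kertnek", "embernek", "házban", "kertben", "fiúnak"]

for n in range(1, 6):
    print(f"N={n}:", top_endings(sample_words, n, limit=3))
```

Unlike the query, this sketch breaks ties alphabetically, so the order of equally frequent endings may differ from what the database returns.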